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Abstract 

This paper deals with the problem of parameter estimation based on certain eigenspaces 
of the empirical covariance matrix of an observed multidimensional time series, in the case 
where the time series dimension and the observation window grow to infinity at the same 
pace. In the area of large random matrix theory, recent contributions studied the behavior 
of the extreme eigenvalues of a random matrix and their associated eigenspaces when this 
matrix is subject to a fixed-rank perturbation. The present work is concerned with the 
situation where the parameters to be estimated determine the eigenspace structure of a 
certain fixed-rank perturbation of the empirical covariance matrix. An estimation algorithm 
in the spirit of the well-known MUSIC algorithm for parameter estimation is developed. It 
relies on an approach recently developed by Benaych-Georges and Nadakuditi relating 
the eigenspaces of extreme eigenvalues of the empirical covariance matrix with eigenspaces of 
the perturbation matrix. First and second order analyses of the new algorithm are performed. 

Keywords: Large Random Matrix Theory, MUSIC Algorithm, Extreme Eigenvalues, 
Finite Rank Perturbations. 



1. Introduction 

Parameter estimation algorithms based on the estimation of an eigenspace of the auto- 
correlation matrix of an observed multivariate time series are very popular in the areas of 
statistics and signal processing. Applications of such algorithms include the estimation of 
the angles of arrival of plane waves impinging on an array of antennas, the estimation of 
the frequencies of superimposed sine waves, or the resolution of multiple paths of a radio 
signal. Denoting by N the signal dimension (e.g., the number of antennas) and by n the 
length of the time observation window, the observed time series is represented by a JV x n 
random matrix £„ = X n + P n where X n and P n are respectively the so-called noise and 
signal matrices. In many applications, P n is represented as 

P n = B( i p u ---Vr)S* n , (1) 
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where (tpi, . . . , ip r ) are the r < min(.ZV, n) deterministic parameters to be estimated, B is 
a JV x r matrix of the form B{<p\,- ■ ■ cp r ) = [b(ipi) ■■■ b(ip r )] where b(ip) is a known 
C^-valued function of ip, and the S n is an unknown n x r matrix with rank r representing 
the signals transmitted by the r emitting sources. As usual (and unless stated otherwise), 
A* stands for the Hcrmitian adjoint of matrix A. It will be assumed in this work that this 
matrix is deterministic. Often, the noise matrix X n is a complex random matrix such that 
the real and imaginary parts of its elements are 2Nn independent random variables with 
common probability law Af(0, l/(2n)). In this case, we shall say that s/nX n is a standard 
Gaussian matrix. 

We shall consider here "direction of arrival" vector functions b(<p) that are typically met 
in the field of antenna processing. These functions are written 

b(i P )=N~ 1 / 2 [expHD^)]^" 1 

with domain ip £ [0,w/D] where D is a positive real constant and i 2 = — 1. Assuming 
that the angul ar p arameters ifk are all different, the well-known MUSIC (Multiple Signal 
Classification, [23, El) algorithm for estimating these parameters from E„ relies on the 
following simple idea: Assume that y/nX n is standard Gaussian and let II be the orthogonal 
projection matrix on the eigenspace of E£ n E* = BS^S n B* +In associated with the r largest 
eigenvalues, where In is the N x N identity matrix. Obviously, II is the orthogonal projector 
on the column space of B(<pi, . . . , <p r ). As a consequence, the angles ifk coincide with the 
zeros of the function b(ip)*(I — H)b(ip) on [0,n/D]. Since ||&(</?)|| = 1, they equivalently 
coincide with the maximum values (at one) of the so-called localization function x(v) — 

%)*n%). 

In practice, II is classically replaced with the orthogonal projection matrix II on the 
eigenspace associated with the r largest eigenvalues of E n E*. Assuming N is fixed and 
n — > 00, and assuming furthermore that S^S n converges to some matrix O > in this 
asymptotic regime, the EE* BOB* + In by the Law of Large Numbers (a.s. stands 
for almost surely). Hence, the random variable Xciassicai^) = b(ip)*Hb(<p) a.s. converges to 
x(v)j an d it is standard to estimate the arrival angles as local maxima of Xciassicai (</?)• 

However, in many practical situations, the signal dimension N and the window length n 
are of the same order of magnitude in which case the spectral norm of n — n is not small, as 
we shall see below. In these situations, it is often more relevant to assume that both N and 
n converge to infinity at the same pace, while the number of parameters r is kept fixed. The 
subject of this paper is to develop a new estimator better suited to this asymptotic regime, 
and to study its first and second order behavior with the help of large random matrix theory. 

In large random matrix theory, much has been said about the spectral behavior of X n X* 
in this asymptotic regime, for a wide range of statistical models for X n . In particular, it is 
frequent that the spectral measure of this matrix converge to a compactly supported limiting 
probability measure n, and that the extreme eigenvalues of X n X* a.s. converge to the edges 
of this support. Considering that E„ is the sum of X n and a fixed-rank perturbation, it is 
well-known that E„E* also has the limiting spectral measure 7r 0, Lemma 2.2]. However, 
the largest eigenvalues of E„E* have a special behavior: Under some conditions, these 
eigenvalues leave the support of tt, and in this case, their related eigenspaces give valuable 
information on the eigenspaces of P n . This paper shows how the angles ifk can be estimated 
from these eigenspaces. 

The problem of the behavior of the extreme eigenvalues of large random matrices sub- 
jected to additive or multiplicative low rank perturbations (often called "spiked models") 
have received a great deal of interest in the recent years. In this regard, the authors of 
0j HI l25j study the behavior of the extreme eigenvalues of a sample covariance matrix when 
the population covariance matrix has all but finitely many eigenvalues equal to one, a prob- 
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lem described in [2C§. Reference [13j is devoted to the extreme eigenvalues of a Wigncr 
matrix that incurs a fixed-rank additive perturbation. Fluctuations of these eigenvalues are 
studied in 0, |H |H H [H [H @] . 

Recently, Benaych-Georges and Nadakuditi proposed in [t| a powerful technique for 
characterizing the behavior of extreme eigenvalues and their associated eigenspaces for three 
generic spiked models: The models X n + P n and (J„ + P n )X n when both X n and P n are 
Hermitian and P n is low-rank, and the model that encompasses ours (X n + P n )(X„ + P„)* 
where X n and P n are rectangular. One feature of this approach is that it uncovers simple 
relations between the extreme eigenvalues and their associated eigenspaces on the one hand, 
and certain quadratic forms involving resolvents related with the non-perturbed matrix X n 
on the other. This makes the method particularly well-suited (but not limited to) the 
situation where X n is unitarily or bi-unitarily invariant, a situation that we shall consider 
in this paper. Indeed, in this situation, these quadratic forms exhibit a particularly simple 
behavior in the considered large dimensional asymptotic regime. 

In this paper, we make use of the approach of d,Q to develop a new subspace estimator of 
the angles ifk based on the eigenspaces of the isolated eigenvalues of £„£* . We perform the 
first and second order analyses of this estimator that we call the "Spike MUSIC" estimator. 
Our mathematical developments differ somehow from those of [H, Q and could have their own 
interest. They are based on two simple ingredients: The first is an analogue of the Poincarc- 
Nash inequality for the Haar distributed unitary matrices which has been recently discovered 
by Pastur and Vasilchuk and the second is a contour integration method by means of 
which the first and second order analyses are done. The key step of the second order analysis 
of our estimator lies in the establishment of a Central Limit Theorem on the quadratic forms 
b(<pi)*Hib((pi) where the IP, are the orthogonal projection matrices on certain eigenspaces of 
£„£* associated with the isolated eigenvalues. The employed technique can easily be used 
to study the fluctuations of projections of other types of vectors on these eigenspaces. 

We now state our general assumptions and introduce some notations. 



Assumptions and Notations 

We now state the general assumptions of the paper. Consider the sequence of N x n 
matrices £„ = X„ + P n where: 

Assumption Al. The dimensions N,n satisfy: N < n, n oo and 

N 



c 



n 



€ (0,1] 



(notation for this asymptotic regime: n — > oo). 



The following assumption on X n is widely used in the random matrix literature [18l |24| : 

Assumption A2. Matrices X n are random N x n bi-unitarily invariant matrices, i.e., each 
X n admits the singular value decomposition X n = L„r„i?* where L n , the N x N matrix 
r„ and R n are independent, L n is Haar distributed on the group U(N) of unitary N x N 
matrices, and R n is a n x N submatrix of a Haar distributed matrix on U(n). 



We recall that the Stieltjes transform of a probability measure 7r on the real line is the 
complex function 

m(z) = j - n(dt) , 



t 



analytic on C+ = {z : < <s(z) > 0}. 
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Assumption A3. Let Q n (z) = (X n X* — z/jv) -1 be the resolvent associated with X n X* 
and let a n (z) — N~ 1 trQ n (z). For every z G C+ ; a n (z) a.s. converges to a deterministic 
function m(z) which is the Stieltjes transform of a probability measure tt supported by the 
compact interval [A_,A+]. 

Assumption A4. The quantity ||Af„X*|| a.s. converges to A+ as n — > oo, where \\ ■ \\ denotes 
the spectral norm. 

Let Q n (z) = (X*X n — zln)^ 1 and a n (z) = n^ 1 tr Q n (z). Equivalently to the conver- 
gence assumed by Assumption IA3] one may assume that a n (z) a.s. converges on C+ to a 
deterministic function rh{z) which is the Stieltjes transform of a probability measure tt. In 
that case, fh(z) = cm(z) — (1 — c)/z and tt = ctt + (1 — c)Sq. 

Remark 1. In the areas of signal processing and communication theory, the noise matrix X n 
satis fying Assumptions A2-A4 is such that ^fnX n is standard Gaussian - see for instance 



We first make a general assumption on matrices P n ; it will be specified later, and adapted 
to the context of the MUSIC algorithm: 

Assumption A5. Matrices P„ are deterministic with a fixed rank equal to r for all n large 
enough. Denoting by P n = U n Q, n V* a singular value decomposition of P n , the matrix of 
singular values Q„ = diag(£Ji >n , . . . , u> r ,n) with 0J\ t n > W2, n > • • • > ^r,n converges to 



O = 



(2) 



where uj\ > ■ ■ ■ > ui s > and ji + ■ ■ ■ + j s 



Notations. 

As usual, if z € C, we shall denote by 5R(z) and Ss(z) its real and imaginary parts. We 

shall denote by °' s "> (resp. -^>, —¥) the almost sure convergence (resp. convergence in 
probability, in distribution). We denote by Sij the Kronecker delta (= 1 if i = j and 
otherwise). 

The eigenvalues of S n S*j are Ai in > A2, n > ••• > ^N.n- Associated eigenvectors will 
be denoted Wi, n , U2,nj ■ ■ ■ j "Ar,n- For k € {1, . . . , r}, we shall denote by i(k) the index i € 
{1, . . . , s} such that ji + • • • + ji-i < k < ji + • • • + ji- For i — 1, . . . , s, We shall denote 
by the orthogonal projection matrix on the eigenspace of Y, n T^ associated with the 
eigenvalues Xk, n such that i(k) = i, i.e., Hi. n — J2k-i{k)=i^k,nUk n when this eigenspace is 
defined. Columns of U n (see A5) will be denoted U\, n , ■ ■ ■ ,u r , n - Given i, the orthogonal 
projection matrix on the eigenspace of P n Pn associated with the eigenvalues io\ such that 
i(k) = i will be IL^ = Ylk'i(k)=i u fc-« M fc n - Indexes n and N will often be dropped for 
readability. 



Paper organization 

The paper is organized as follows. Section[2]is devoted to the mathematical preliminaries. 
The general approach is described in Section [3] The Spike MUSIC algorithm is presented in 
Section [4] along with a first order study of this algorithm. Fluctuations of the estimates of 
the ipk are studied in Section [5] under the form of a Central Limit Theorem. 
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2. Preliminary mathematical results 



We shall need the two following results. The first one is well-known 23j. The second 
result, due to Pastur and Vasilchuk, is the unitary analogue of the well-known Poincare-Nash 
inequality. 

Lemma 1. Let W = [tWy] be a random matrix Haar distributed on Lt(n). Then 

E [wijW* rj ,] = -5i,i'$j,j' ■ 

Lemma 2 ( [HI, Hit ) ■ Let $ : hi (n) — > C be a function that admits a C 1 continuation to an 
open neighborhood ofU(n) in the whole algebra of n x n complex matrices. Then 

1 n 

vai$(W n ) = E\<S>(W n )\ 2 - \E<S>(W n )\ 2 < - V E \$'(W n ) ■ (eje^W n ) | 2 

j,k=l 

where E is the expectation with respect to the Haar measure on U(n), where is the differ- 
ential o/$ as a function on M. 2n acting on the matrix ejeJW n seen as an element o/R 2n , 
and where ej = [0 • • • 1 • • • 0]* is the j canonical vector of C™. 

Given a small e x > 0, let O n be the probability event 

0„ = {||X n X:|| <A ++£l } . (3) 
By Assumption I A41 lo n —— > 1 as n — >• oo. 



Lemma 3. Let Assumvtion \A2\ holds true and let u, v be two unit norm deterministic N x 1 
vectors such that u*v = 0. Then for any z with 5ft(z) > A + + e\, 

E |lo„ X u* (Q(z) - a(z)I) u\ p < Np/2d{z Kp x+ + e l)P > 
E|lo„ x u*Q{z)v\ p < 



NP/ 2 d(z,X + + e 1 )P ' 

where the constant K p only depends onp, and where d(z, z') is the Euclidean distance between 
z and z' in C. 



Proof. Recall that X = LT R* by Assumption [A2] let D = (T 2 - zl)' 1 ; write: 



Wl w 2 ■ 

Thanks to A2, w\ and u>2 are the first two columns of el N x N unitary Haar distributed 
matrix W = [w tj ] independent of D. Let M = t 0n x (D - N~ l {txD)l) and <S> t (W) = 
wlMWi for i = 1,2. Then EQ^W) = E$ 2 {W) = by Lemmaffl Applying Lemma[lto 
after noticing that & t {W) ■ A = ejA*Mwi + w\MAei for any N x N matrix A, we obtain: 

1 N 

E|$ 4 | 2 = var($ t ) < - J2 E \ w ki[ MW hi + [W*M] ljWkl \ 2 , 

<|E(!|Mm J || 2 + ||Mu, 1 || 2 ) , 
8 

- Nd{z 1 X + + e 1 ) 2 ' 



ii 

v* 



(Q - aL) 



D 



trD 
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We now proceed by induction; assume that the result is true until p > 1. Applying Lemma 
CU to $^ +1 )/ 2 , we obtain: 



< 



1 N 

- y 



<^E 



2N 



(ii 



Mwi 



< 



2(p+l) 2 K p . 



d(z,\++e 1 )P+ 1 N(P+V/ 2 ' 
Using again the induction hypothesis, we get: 



iMioil 



E|$,-| p+1 = var( $, 2 



< 



2(p+l) 2 K p _ 1+ Kf p+1 



CP+X)/2 



d(z, A+ + £i )p+17V(p+i)/2 rf( Z; A+ + £i )p+17V(p+i)/2 



which concludes the proof. 



□ 



Lemma 4. Lei Assumption IA2I ZioW irite; tet it, v be two unit norm deterministic vectors 
with respective dimensions N x 1 and n x 1. TTien /or cm?/ z such as > A+ + ex, 



E 



lo„ x u*XQ{z)v 



< 



iP/ 2 d{z,\ + +£i)p" 



Proo/. Let C = r(r 2 -z/) -1 . By Assumption I A21 u*XQ(z)v = w*Cw = <&(w) where w is a 
vector uniformly distributed on the unit sphere of , w is a vector uniformly distributed on 
the unit sphere of C™ and truncated to its first N elements, and w, w and C are independent. 
The lemma is proved as above by applying Lemma [2] to $ and by taking the expectation 
with respect to the law of w. □ 

Lemma 5. Let Assumptions IA1HA4I hold true. Let C be a closed path of C such that 
min ze c 3?(z) > A+. Fix the integer r < N and let U n and V n be two deterministic isometry 
matrices with dimensions N x r and n x r respectively. Then 

sup\\U*(Q n (z)-m(z)I N )U n \\ 
zee 

sup || V;* (Q n (z) - m(z)I n ) V n \\ 
zee v ' 

sup||C/*X„Q„(2)V„|| 
zee 



-+ 0, 



n—?oQ 

a . s 



-+ o, 



> . 



Proof. Recall the definition ^ of the set O n and assume that E\ is chosen such that 
min 26 c %t(z) > A+ + e±; let 

h n (z) = lo„ x U* (Q„(z) - a n (z)L N ) U n . 

For any £,s < r, [h n ]e :S is a holomorphic function on C— [0, A + +£i]. Consider a dcnumcrable 
sequence of points (zk) in C — [0, A + + si] with an accumulation point in that set. By Lemma 
|3]with p — 3, Markov inequality and Borel-Cantclli's lemma, there exists a probability one 
set on which \h n (zk)]i }S — > for every k. Moreover, the |[/i n (^fc)]^,«| are uniformly bounded 
on any compact set of C — [0, A+ + £i]. By the normal family theorem, every n-sequence 
of [/i n ]£,s contains a further subsequence which converges uniformly on the compact set 
C C C — [0, A + +£i] to a holomorphic function that we denote h*. Since h*(zk) = for all k, 



G 



h*(z) — on C, hence |[/i n (-z)]^,s| converges uniformly to zero on C with probability one, and 
thanks to Assumption I A4[ \\U* (Q(z) — a(z)I) U\\ — > uniformly on C with probability one. 
The same argument, used in conjunction with Assumption IA3l shows that with probability 
one, a{z) — m(z) uniformly on C, and the first assertion is proven. The second and third 
assertions are proven similarly, the third being obtained with the help of Lemma |4j □ 



3. Fixed Rank Perturbations: First Order Behavior 



We first recall a result on matrix analysis that can be found in [19|, Th. 7.3.7] 
Lemma 6. Given a N x n matrix A with N < n, let A be the matrix: 

" A 
A* 

Then o~\, ■ ■ ■ , ctat are the singular values of A if and only if oi, • • • , ctat, — ai, • • • , — ctat in 
addition to n — N zeros are the eigenvalues of A. Furthermore, a pair (u,v) of unit norm 
vectors is a pair of (left, right) singular vectors of A associated with the singular value a if 
"«/\/2l 



and only if 



v/V2 



is a unit norm eigenvector of A associated with the eigenvalue a. 



Along the ideas in [8|, |9( , we now characterize the behavior of the largest eigenvalues of 
EE*, and then focus on their eigenspaces. 

Asymptotic behavior of the largest eigenvalues o/EE* 

We start with an informal description of the approach. By Lemma [6j A is an eigenvalue 
of EE* if and only if det(S - y/XI) = where S = 







s = 



" 


X' 




~u 


" 




' l,' 




~u* 





X* 





+ 





vn 




I r 







nv* 



Writing: 



= X + BJB* 



(4) 



and assuming that x > is not a singular value of X, we have: 

det(S - xl) = det(X - xl + BJB*) = det(J) dct(X - xl) det(J + B*(X - xI)- l B) , 
after noticing that J = J . Using the formula for the inversion of a partitioned matrix (see 



"All 


Al2 


-1 


{A 11 -A 12 A^A\ 2 )- 1 


-A^A 12 (A 22 - A* 12 A^A 12 )- r 


A* 


A 2 2_ 




_-(A 22 -A* 12 A^A 12 )-^A* 12 A^ x 


{A 22 -A* 12 A^A 12 y l 



we obtain: 



Q(a:) = (X - xiy 



Therefore, 
where 



H n (x) 



1 _ 


'-xl X ' 


-1 


xQ(x 2 ) XQ(x 2 ) 




X* -xl 




Q{x 2 )X* xQ{x 2 ) 


xl) 


= det(J) det(X - 2 


I)detH(x) , 




xU*Q(x 2 )U 


I r - 


\- U*XQ(x 2 )Vn 





(5) 



i r + nv*Q(x 2 )x*u x nv*Q(x 2 )vn 
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whence for n large enough, the isolated eigenvalues of EE* above A+ will coincide with the 
zeros of detH(y / x) that lie above A+. Under Assumptions IA1TIA51 Lemma [5] shows that 
H{x) a.s. converges to 

xm(x 2 )I r I r 
I r xrh(x 2 )0 2 



H(x) 



Consider the equation 

det H(y/x) = det (xm(x)rh(x)0 2 - I r ) = , 
and notice that the function 



g(x) = xm{x)fh(x) = x 



1 



t — x 



w(dt) 



1 



t — x 



Tr{dt) 



(6) 



decreases from g{\\) = linx^Xf. g(x) to zero on (A+, oo). Let ui'f > ■ ■ ■ > ui 2 q be those among 
the diagonal elements of O 2 that satisfy w| > l/g(Xt). Equation g(x) = w~ 2 will have 
a unique solution x — pi > X + for any i = 1, ■ ■ • ,q, while it will have no solution larger 
than A + for i > q. It is then expected that any eigenvalue \k, n of E„E* for which i(k) < q 
(remember the definition of i(k) provided in the paragraph "Assumptions and Notations" in 
ScctionQ]), will converge to pi, while A^-i \-j q +i,n A+ almost surely. 

These facts are formalized in the following theorem, shown in 0, H[ : 

Theorem 1. Let Assumptions [AT1IA5I hold true; let q be the maximum index such that 
uj 2 > l/g(X+). Let pi be the unique real number > A + satisfying oj 2 g{pi) = 1 for i = 1, • • • , q. 
Then 

a.s. 

A h + ---+3i-i+e,n > Pi 



for * = !,••• , q and £ = 1 , ■ ■ • ,ji while 



A 



JiH hj q + l.n 



-> A, 



In the case where \pn~X is a standard Gaussian matrix, 7r is the Marcenko-Pastur distri- 
bution with support supp(7r) = [A_, A+] = [(1 — \/c) 2 , (1 + y/c) 2 }, and 



mix) 



1 

lex 



■+y/[\-c-xf 



4cx 



(7) 



for x £ (A_|_, oo). After a few derivations, we obtain: 



Corollary 1. Assume y/nX is standard Gaussian. Let q be the maximum index such that 
uj 2 > \fc. Then 

- a.s. (cj? + l)(w?+c) 

A jH hw-i+An ► 9 J° r * — t, . . . , q , 



id A 



ilH hjg + l," 



■* (i + V^) 5 



We now turn our attention to the eigenspaces of the isolated eigenvalues. 



Asymptotic behavior of certain bilinear forms. 

Recall the definition of s as provided in Assumption IA5I Given i < s, assume that 
uj 2 > l/p(At). Given two Af x 1 deterministic sequences of vectors b\^ n and 62, n with 
bounded norms, we shall find here a simple asymptotic relation between b\ ra n ijr j6 2 ,n an d 
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b\ n Hi, n b2,n, that will be at the basis of the Spike MUSIC algorithm. A close problem has 
been considered in We consider here a different technique, based on a contour integration 
and on the use of Lemmas [3] and 2] This method lends itself easily to the first and second 
order analyses of the Spike MUSIC algorithm that we shall develop in the following sections. 



Writing 



with i = 1,2, we have by virtue of Lemma |5J 



6TIL& 2 



— / bl (S -z/) _1 b 2 dz 



Ci,„ 



where is a positively oriented circle that encloses the only singular values y \k, n of £„ 
for which i(k) = i. Recalling (@| and using Woodbury's identity ([3, §0.7.4]) together with 
the fact that J = J , we obtain: 

&*IL& 2 = — £ KQ(z)b 2 dz 

+ —£ b* 1 Q(z)B (J + B*Q(z)By 1 B*Q(z)b 2 dz . 
Using ([5]), we obtain after a straightforward calculation: 

whcrcQ 



— / bl „Q„(z)b 2 .„ dz + — I a\ (z)H n (z) 1 d 2 . n (z)dz (8) 



de,n{z) 



zU*Q n (z 2 ) 

n n v*Q n {z 2 )xi 



h* 



zQ n (z 2 )U n X n Q n (z 2 )V n fl n 



(9) 



Intuitively, the first integral is zero for n large enough and the second is close to 

1 



a\ Jz)H(z) 1 a 2 , n {z)dz 



where 7$ is a small enough positively oriented circle which does not meet the image of supp(7r) 
by x 1— > \px nor any of the y/~pi and such that only ^fpi € Int(7i), the interior of the disk 
defined by 74 (see Figure[T|), a* tn (z) = b\ n [zm(z 2 )U n 0] , and 



ai.n( z ) 



zm(z 2 )U* 




The approximation b\Ylib 2 — Ti will be justified rigorously below. For the moment, let 
us develop the expression of Tj. Defining the r x r matrices: 





1 Notice that (2) as defined is not the Hermitian adjoint of ag n (z). Despite this ambiguity, we 
introduce this notation which remains natural and widespread in Signal Processing. 
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Figure 1: The contour 74 w.r.t. the support of the limit singular value distribution of X n and the other ^fpi- 



where the integers ji are defined in Assumption IA51 we have 

1 



z 2 m(z 2 )rh(z 2 )u) 2 — 1 

z z m{z 2 ) 2 m{z 2 )w 2 



zrh(z 2 )uj 2 — 1 
— 1 zm(z 2 ) 



which leads to 



■ 1; 



(10) 



z 2 m(z 2 )rh(z 2 )uj 2 — 1 
wm(w) 2 rh(w)uj 2 



1 wm{w)rh(w)u) 2 



1 



dz 



dw 



by making the change of variable w = z 2 . Observe that the path 7^ now encloses pi only. 
Recall that wm(w)rh(w)uj 2 — 1 = if and only if w — pi for every £ such that uj 2 > 1/<7(A+), 
and since g{w) = wm(w)rh(w) is decreasing on (A + , 00), these zeros are simple. As a result, 
the integrals above are equal to zero for £ ^ i, and the integrand has a simple pole at w = pi 
for £ = i. By the Residue Theorem, we have: 



rp 1 / */,rr/ N-l / s , P l m{p l ) 2 m( Pi ) 

Ti= — f ai (z)H(z) a 2 (z)dz=- 

(p i m(p i )m(p l )) 



(ii) 



where the denominator at the right hand side is the derivative of the function A 1— > Am(A)m(A) 
at A = pi. We now make this argument more rigorous: 



Theorem 2. Let Assumptions IA1HA5I hold true. For a given i < s, assume that us 2 > 
1/<?(A+). Let (&i, n ) and (&2,n) be two sequences of deterministic vectors with bounded norms. 
Then 

h* ff h PMPi) 2 ™(Pi) h * n h 



{p l m{pi)rh{p i ))' 



-> 



Proof. Write 



T; 



1 

VK 



a* 1 {z)H{z) 1 a2{z)dz 



Then, with probability one, 6|IT&2 = Ti for n large enough. Indeed, on the set O n (as 
defined in ([3])), the singular values of £ greater than + s\ coincide with the poles of 

H{z) which are greater than U A+ + E\ by the argument preceding Theorem [TJ On this set, 
the first integral on the right hand side (r.h.s.) of (JHJ) is zero, and by Theorem[TJ the second 
integral can be replaced with J with probability one for n large enough. By Lemma [SJ the 

differences H(z) — H{z), a\{z) — a\{z) 1 and 02(2) — 02(2) a.s. converge to zero, uniformly on 
7;. Hence % - T t ^ 0. □ 
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4. The Spike MUSIC Estimation Algorithm 



Algorithm description 

We now consider the application context described in the introduction, and assume 
that P n = B n (ipi,...,ip r )S* where B n (tp x , . . . , <p r ) = [b n {<pi) • • • b n (tpk)] , and b n (ip) = 
jy— 1/2 [cxp(— iDlip)\ ( _ Q with domain <p £ [0,tv/D]. When the tp k are different, one can 
check that B* n B n —> I r as n — >• oo. In most practical cases of interest, 5*5 n — ► O 2 where O 
is given by Equation ((5}. In these conditions, due to B* n B n — ► J r , the diagonal elements of 
O are the limits of the singular values of P n and Assumption I A5I holds true. 

In the area of signal processing, the positive real numbers uif are called the Signal to 
Noise Ratios (SNR) associated with the r sources. Assumption IA5I becomes: 

Assumption A6. Matrices P„ of dimension N x n are deterministic and are written: 

P n = B n (<pi, ■ ■ ■ fr)S* 

where r is a fixed integer, B n (ipi, ■ ■ ■ ip r ) = \b n (ipi) ■■■ b n (<p r )] is a N x r matrix, 

b n {<p) = AT-V2 [ eX p(-zD^)]^l~ on if € [0,ir/D], and the tpk are all different. Matrix 
S n of dimensions n x r satisfies: 

Ms*s n - o 2 ) = 0(1) 

as n — > oo, where O is defined in Assumption I A5l and O is the classical Landau notation. 



The assumption over the speed of convergence of S*S will be needed only for the purpose 
of the second order analysis. It is satisfied by most practical systems met in the field of signal 
processing. We moreover observe that it is possible to relax the assumption that O is diagonal 
at the expense of a more complicated second order analysis. 

In order for the algorithm to be able to estimate the r angles, it is necessary that the 
perturbation P gives rise to r isolated eigenvalues, a fact that is stated in the following 
assumption: 

Assumption A7. Recall the definition ([6]) of function g, let A + as defined m lA3l and let 
<?(A^) = \\m x i\ + g(x). Let the u>i 's as defined m lA5l then: 

1 



The Spike MUSIC algorithm goes like this. The localization function xi'fi) defined in 
the introduction is also written as xif) = Si=i b(ip)*Hib(ip). Given <p, the results of the 
previous section (Theorems Q] and [5] with &i = 62 = b((p)) show us that: 

r 

Xn(<P) = ^2 l & n( ( ^)*"fc,«| 2 C(Afe,, l ) , (12) 
fc=l 



C(A) = KO'PVti (13) 



where 

(Am(A)m(A)) / 
Am(A) 2 m(A) 

is a consistent estimator of Xn(<p) in the asymptotic regime described bv lAll By searching 
for the maxima of x(</?), we infer that we obtain consistent estimates of the angles or arrival. 
Observe that this algorithm requires the knowledge of the Stieltjes Transform of the limit 
spectral measure of XX* (available if the statistical description of the noise is known) and 
the number r of emitting sources. Notice that when this number is unknown, it can be 



estimated along the ideas described in e.g. [lfl |22|. 



We now perform the first order analysis of this algorithm. 
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First order analysis of the Spike MUSIC algorithm 

We now formalize the argument of the previous paragraph and we push it further to show 
the consistency "up to the order n" of the Spike MUSIC estimator. Wc shall need this speed 
to perform the second order analysis (Lemma [S] below). 

Theorem 3. Let Assumptions [ATTIA6I hold true. Then for all k = 1, • • • , r, there exists a 
local maximum tfk,n °f Xn(v) such that 

n((pk,n ~ <Pk) — > 0. 



The proof of this theorem is performed in two steps. With an approach similar to the 
one used in Section[3J we first prove that x(tp) — x{f) 0) an d the convergence is uniform 
on ip <G [0,7t/D] (Proposition Q] below). Next, following the technique of 1(| 17 1, we prove 
that this uniform a.s. convergence leads to Theorem [3J 



In the sequel, we write: 

a(z,ip) = 



b(tp) and a(z, tp) 



zm{z 2 )U* 




(14) 



zU*Q(z 2 ) 

nv*Q(z 2 )x* 

a*(z,<p) = b*{f)[zQ(z 2 )U XQ{z 2 )Vil] , 
a*{z,ip) = b{f)[zm{z 2 )U 0] . 

Beware that a* and a* are not the Hermitian adjoints of a and a (see the footnote associated 
to Eq. ©). 



Proposition 1. In the setting of Theorem^ 

\Xn(<p) ~ Xn(<p)\ 



max 

V e[0,7r/D] 



-> . 



Proof. Write 

r s 

x(<p) - x(v>) = - c(ft(fc)))i6(v)*fi*i a + E (c(p*)%m%) - %rn 4 %) 



fc=l i=l 



By Theorem[T]and the continuity of £ on (A+, +oo), the first term at the r.h.s. goes to zero a.s. 
and uniformly in f. Consider the second term. Let 73 be a small enough positively oriented 
circle which does not meet supp(7r) U {^/pi, • • • , y/pl} and such that only yfpl £ Int(7i). 
Since A fe -^4 



max max 

i tp 



b(ipyiub{tp) - Ti(<p) 







a.s. for n large enough, where 



Ti(<p) = — <p a* (z , ip) H (z) 1 a(z,if) dz 

Recalling Eq. (|llj). it will therefore be enough to prove that 

max max a ' s ' > , 

l<i<s ipe[0,ir/D] ri-s-oo 

where 

Zi{ip) = ^-j (a*(z,ip)H(z)- 1 a(z,^)-a*(z,ip)H(z)- 1 a(z,i P )^ dz 



12 



Wc have 



max\Zi(<p)\ < 2R [ max e(y/p~ + Re 2l7r9 , ip) d6 



where R is the radius of 7, and where 

e(z, <p) = a*(z, <p)H{z)~ 1 d{z, <p) — a*(z, tp)H(z)~ 1 a(z, ip) 

<\(a* - a^H^al + \aH-\a - a)\ + a*(H _1 - H'^a 
Since max v ||a|| and max v ||a|| are bounded on 7^, e{z,ip) satisfies on this path 

e(z,<p) < K (\\a(z,<p) - a(z, ^)|| + ll^)" 1 - 



By Lemma[5]and the fact that is bounded on -fi, the term — = \\H~ l (H — 

H)H~ 1 \\ converges to zero uniformly on 7, with probability one. To obtain the result, we 
prove that \\a — a\\ and that this convergence is uniform on (z, ip) £ 74 x [0, tt/D]. Let 
us focus on the first term zu\(Q(z 2 ) — m(z 2 )I)b(p) of a — a, where we recall that u\ is the 
first column of U . Since || &(<£>) || = ||ui|| = 1, 

\zul(Q(z 2 ) - m(z 2 )I)b(p)\ < \zul(Q(z 2 ) - a(z 2 )I)b(p)\ + \z(a(z 2 ) - m(z 2 ))\ . 

With probability one, the second term converges to zero on 7,, and the convergence is uniform 
(along the principle of the proof of Lemma [5]). Since 



supmax||n 1 6'(<^)|| = supmax n 1 N x l 2 [^Dexp(— ilDp)^_^ 



< 00 



the term 
satisfies 



£0, ip) = 1 „ x zuKQiz 2 ) - a{z 2 )I)b(<p) 



\€(zx,<Pi) - £( z 2, < K(n\ipi -p 2 \ + \zi- z 2 \) 
for every (zi,(pi), (22, ^2) in 7i x [0,7r/D]. Therefore, it will be enough to prove that 



max ^(z,<p) — — ■> 

(z,jj)£i„xB„ n->oo 



where A n contains n regularly spaced points in 7, and B n contains n 2 regularly spaced points 
in [0, 7r/D]. This can be obtained from Lemma [3] with p = 9, Markov inequality and Borcl 
Cantclli's lemma. The other terms of d — a can be handled similarly. □ 

We now prove Theorem [3l by following the ideas of ljl 17 1. To that end, we need the 
following lemma, proven in [14| : 

Lemma 7. Let (cn) be a sequence of real numbers belonging to a compact of [—1/2,1/2] 
and converging to c. Let 

'(c N ) = — ^ cxp(-2iirkc N ) . 



k=0 



Then the following hold true: 



Qn{ c n) 



N-too 



> ifc^Q, 



N- 



N- 



-> if c = and N\cn — c\ — > 00 , 

-> exp(— iird) sinc(d) if c = and N\c^ — c\ — > d , 



where sine stands as usual for sine cardinal. 
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Proof of Theorem^ We start by observing that x{<p) = d(tp)*(B*B) 1 d(ip) where B is the 
matrix defined in lA6l and where d(ip) = [b(<pk)*b(tp)] , _ . By Lemma[7J B*B — > 7 r , hence 

xM-IMMII 2 ^o. 

In the remainder of the proof, we shall stay in the probability one set where the uniform 
convergence in the statement of Proposition [JJ holds true. Taking k = 1 without loss of 
generality, we shall show that any sequence (pi in for which x(^i,n) attains its maximum in 
the closure of a small neighborhood of <p\ satisfies N{ip\ <n — ipi) 0. Given a sequence of 
such <pi t m assume we can extract a subsequence tpi, n * such that N\(pi }Tl * — fi\ — > oo. In 
this case, Lemma [7J and the observations made above on the structure of x(f) show that 
x{<Pi,n*) -> 0. Since max^ \x[f) — 0- xOpi.n*) 0. But -> x(v?i) = lj which 

contradicts the fact that tpi n * maximizes Hence the sequence N(ipi^ n * — ipi) belongs to 
a compact. Assume N(<pi n * — ipi) -/> 0. If we take a further subsequence of the latter that 
converges to a constant d ^ 0, then by Lemma [71 x converges to sinc(d) 2 < 1 along this 
subsequence, which also raises a contradiction. This proves the theorem. □ 



5. Second Order Analysis of the Spike MUSIC Estimator 

In order to perform the second order analysis, we also assume: 

Assumption A8. Let A_, A+, a and m be as in IA31 Then for any z € C — [A_,A+], 
\fn(a.(z) — m(z)) converges in probability to zero. 

Remark 2. If y/nX is standard Gaussian and if c n = N/n satisfies \pn(c n — c) — > 0, then 
Assumption IA81 is satisfied. Indeed, call m n {z) the Stieltjes Transform of the Marcenko- 
Pastur distribution, i.e., the analytic continuation of (|7[), when c is replaced with c„, and let 
7r„ be the associated probability measure. For zgC - [A_, A+], function f(x) = (x — z)^ 1 
is analytic outside the support of n n for n large, and U Th.1.1] can be applied to show that 

\fn{a n {z) — m n (z)) — > 0. When \fn(c n — c) — > 0, it is furthermore clear that v / n(m„(z) — 
m{z)) -> 0. 

The main result of this section is the following: 
Theorem 4. Let Assumvtions [AT§ A8\ hold true. Then the estimates <fik,n satisfy 

01,71 - fl 



}Pr,n *Pt 

where 

2 6 / m'(pi) -m{ Pi ) 2 2,,s, *i \\\ 1 / - ^ 

cM^ +u; i (m(p i )+Pim(p i ))j, l<z<s. 

When sfnX is standard Gaussian, plugging the r.h.s. of (J7J into this expression leads 
after some derivations to: 

Corollary 2. If y/nX is standard Gaussian and if y/n(c n — c) — > 0, the convergence (|15|) 
holds true with 

2 _ 6 u\ + 1 
° l ~ c 2 D 2 u}f-c' 

This corollary calls for some comments: 



0, 



a{I n 



(15) 
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Figure 2: Spike MUSIC algorithm, Variance vs N. 



Remark 3 (Efficiency at high SNR). Recalling that u> 2 > s/c is the condition for the ex- 
istence of a corresponding isolated eigenvalue ( Corollary Qp, we observe that the estimator 
variance for (fk goes to infinity as the corresponding oj 2 decreases to s/c. At the other ex- 
treme, this variance behaves like 6c~ 2 D~ 2 ui~ 2 as ujf — > oo. It is useful to notice that this 
asymptotic variance coincides with the Cramer-Rao bound for estimating ifk 111/- I 71 other 
words, the Spike MUSIC estimator is efficient at high SNR when the noise matrix is standard 
Gaussian. 



A numerical illustration 

In order to illustrate the convergence and the fluctuations of the Spike MUSIC algorithm, 
we simulate a radio signal transmission satisfying Assumptions IA1HA81 We consider r = 2 
emitting sources located at the angles 0.5 and 1 radian, and a number of receiving antennas 
ranging from N = 5 to N — 50. The observation window length is set to n — 2N (hence 
c = 0.5). The noise matrix X n is such that y/nX n is standard Gaussian. The source powers 
are assumed equal, so that the matrix O given by Equation ([2]) is written O = <jjI%, and the 
Signal to Noise Ratio for any source is SNR = 101og 10 o; 2 decibels. In Figure the SNR is 
set to 10 dB, and the empirical variance of (fi, n — fx (red curve) is computed over 2000 runs. 
The variance provided by Corollary[2]is also plotted versus N. We observe a good fit between 
the variance predicted by Corollary [5] and the empirical variance after N = 15 antennas. In 
Figure [3l the variance is plotted as a function of the SNR, the number of antennas being 
fixed to N — 20. The empirical variance is computed over 5000 runs. The Cramcr-Rao 
Bound is also plotted. The empirical variance fits the theoretical one from SNR w 6 dB 
upwards. 

Proof of Theorem [^} 

We start with some additional notations and definitions. Matrix B = \b(cpi), . . . , b((p r )~\ 
will be often written as B — [61, . . . , b r ] or in block form as B = \B\, . . . , i? a ] where Bi has ji 
columns. We shall also write B' = \b'(cpi), . . . , b'(ip r )\ and B" = \b"(ipi), . . . , b"(cp r )] where 
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■ Var. (empirical) 

■ Var. (theory) 

■ CRB 



Figure 3: Spike MUSIC algorithm, Variance vs the SNR. 



b'(ip) and b"(<p) are respectively the first and second derivatives of b(<p). We shall also use 
the short hand notations B' = [b' v . . . , b' r ] and B" = [b'{, ...,&"]. Matrix B 1 - = [b^, . . . , b±] 
will be defined by the equation 



X -B> 

n 



icD cD 

2 2V3 



(16) 



0. 



Finally, if x n , y n are random sequences, we denote by x n x y n the convergence x n — y r , 

We now state some preliminary results. In the following, we say that the complex random 
vector 77 is governed by the law £A/"(0, R) where R is a nonnegative Hermitian matrix if the 



real vector 



has the law Af( 0, ^ 



. The following proposition, whose 



proof is postponed to |Appcndix A| is crucial: 

Proposition 2. Let Assumvtions ] Al l lA4l hold true. Let t < N be a fixed integer, let W = 
[uji,--- ,u>t] and W = [iDi , • • - , Wt\ be deterministic isometry matrices with dimensions 
N x t and n x t respectively. Let p be a real number such that p > A + . Then 



is tight. 



Vn = 



n(w*(Q(p)-a{p)I N )w, W*(q( P ) - a(p)I n )w, W*XQ{p)w) 

Pt/2 all strictly greater than A+, the t x 1 
\/n [w* k XQ{p k )w k 



Assume t is even. Given real numbers p± 
random vector 



N {wtQ{pk)Wt/2+k) x< 



L<fe<t/2 

converges in distribution towards CAf(0, R) with 

\*/ 2 



R 



diag (m'(pk) - MPk) 2 ) k=1 




l<fc<t/2 



til 

diag (m(pfe) + pkm {pk)) k=1 
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Writing Q — ml = (Q — a.1) + (a — m)I, and similarly for Q, we obtain: 
Corollary 3. Assume in addition that A ssumvtion I A8| is satisfied. Then 

Zn = V^(w*(Q(p)-m(p)I N )w, Vr{Q{p)-m{p)h)w, W*XQ(p)w) 

is tight. 

Intuitively, tightness of leads to the tightness of the ^/n(Xk, n — Pi(k)) • This is formalized 
by the following proposition, proven in | Appendix B| 

Proposition 3. Assume the setting of Theorem^ Then the sequences \/n(\k, n — Pilk)) are 
tight for 1 < k < r. 

The following lemma is proven in |Appcndix C| 

Lemma 8. Let Assumptions A5 and A6 hold true. Then the following convergences hold 
true: 

B*B > I r , 

n— >oo 

n z n-voo \ O / 

(B^YB 1 - > I r , 

n—>oo 

(B X )*B ► 0, 

\\Ui-UBiW > for alii = l,...,s 



where LT^ is the orthogonal projection matrix on the column space of B{. 
We now enter the proof of Theorem 2] 

Recall the definitions (fT2|) and (fT3|) of x and £. In most of the proof, we shall focus on 
\/n(<fi, n — ipi). Recalling that x'(0i) = an d performing a Taylor-Lagrange expansion of 
x' around tp± , we obtain 

o = = x'(^i) + (0i - vi)r(vi) + (yi ~/ l)2 x (3) (^i) , 

where x^ 3 -* is the third derivative of x an d where <pi € [ipi A (pi, (fx V <^i]. Hence 

3/2 r _ \ "~ 1/2 x'(yi) 

n W W n- 2 x"( V i)+O.5n- a (0i- V i)x( 3 )(^i) ' 

We start by characterizing the asymptotic behavior of the denominator of this equation: 
Lemma 9. Assume that the setting of Theorem^ holds true. Then, 

x"(yi) , ,« , x (3) (^i) a.s.< C 2 D 2 

Proof. We have 

= ^Ec(A fc )i(6ir^i 2 + A^ K ( c(Afc )^ fc ^ 

k=l k=l 

= ^(b' 1 )*UU% + ^U(blUU*b'() . (17) 
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Theorem [1] along with the continuity of C on (A+, oo), and Theorem [5] show that 



Writing 



we have 



zcL> . cD 



2^3 



2^3 



K) , 



C 2 i} 2 



by the first, fourth and fifth assertions of Lemma HI By the same lemma, 
4W/ - and -l&^i' . 

Hence rj- 2 x"(^i) -> -c 2 D 2 /6. 

Furthermore, it is easily seen that n~ 3 x^- 3 \(pi) is bounded. Since n((fi — (fx) ^—t by 
Theorem[31 n~ 2 (tpi — <^i)x^(</?i) 0, which establishes the result. □ 

We now turn to the numerator n -1 / 2 ^' '(fi) = IrT 1 ! 2 J2k=i C(Afc)^ (&*WfcMfc&i) > and start 
with the following lemma: 

Lemma 10. Assume that the setting of Theorem^ holds true. Then 

-L*^) _ 28(0 A , 



£ = E ^7= / («* ¥>i) - a*(z, ^Hizy'a'^z, dz, (18) 



and where the deterministic circle 7$ encloses p]/ 2 only and: 

zU*Q(z 2 ) 

nv*Q{z 2 )x* 

zm(z 2 )U* 



-i, \ da{z,(p) 



a'(z,(f) 



da(z, ip) 
dip 







Proof. Recall the definition of % as given in (fT^)) . A direct computation yields: 

r 

X'M = 2^C(A fc ,n)3?(6t(^^'i(^)) , 



k=l 



= 2 E E c(Afc,n)»(6i(^fifcfi3:6i(v)) ■ 

i=l k:i{k)=i 

Recall that r and s are fixed and independent from n by IA51 We start by showing that 



1 ->{ \ 2 

-i=x m) - —r 

Jn Jn 



s 

]Tc(p^(Wi) 



-> 0. 



(19) 



i=l 
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Since \/n(C(^k,n) — CO°i(fe))) i s tight as a corollary of Proposition^ it will be enough to prove 
that n _1 3? (blukU'^bi) — > in probability for every k. By the definition (fT6|) of 5^, we have 

I r n 

By Cauchy-Schwarz inequality, 

\Ku k u* k bi\ 2 < b\% {k) b x (bi)*ii m bi. 

By Theorem [5J 

6;ni (fc) 6i (bi)*u m bi - c(ft (fc )) -2 ^n. i(fe) &i (^)*n l(fe) ^ ^> o, 

and by Lemma [51 ^^n^fej^i (6 1 ")*n ? ( A .)6j L — > (consider alternatively the cases i(k) = 1 
and i(k) > 1) which proves (fH" 



Now, applying ([5]) and (HU), and taking up an argument used in the proof of Theorem [21 
we have 

2±™*^)=2±*(^f^ 0]Q«[*] dz ) 

+ 2j2n(J^£ a*(z,i Pl )H(z)- 1 a' tp (z, i p 1 )dz^ 

= 2j2n (J^L j£ a*(z, <p{)Q{z)-%{z, Vl ) dz 
with probability one for n large. On the other hand, recalling (fTTj) , we have 

s 

= xV) = 25> 



1 a*(z,^i)i?(z) x a' (z,(fi)dz I , 



which proves the result. 

Write H(z) = H(z) + E(z) and d(z, ip) = a(z, ip) + e(z, ip). To be more specific, 
E(z) 



zU*{Q{z 2 )-m{z 2 )I N )U U*XQ(z 2 )Vfl 

ttV*Q{z 2 )X*U zVLV*{Q{z 2 ) ~ m{z 2 )I n )Vn 



□ 



(20) 



and 



e(z,ip) 



~zU* (Q(z 2 J - m{z 2 )l)~ 
flV*Q(z 2 )X* 

Write e' ¥ (z,ip) = de(z,ip)/d<p. For a given z £ 7,-, if -1 = ff" 1 - H~ 1 EH~ 1 + 0(||-B|| 2 ). 
This suggests the following development 

fC(Pi) 



a*(z,ipi)H(z) e' (z,Lpi)dz 



1 e*(z,(pi)H(z) 1 d Jz,ipi)dz 



a*(z, ^ 1 )H{z)- 1 E{z)H{z)- 1 a' ip {z, <pi) dz + q^j 



J2( X hi + X 2,i + *3,i + ft) . 



where the terms qt are "higher order terms" that appear when we expand the r.h.s. of (p~8|) . 
We first handle the terms X k /s, then q L . 
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The terms X\^ 

Writing U n = \Ux <n ■ ■ ■ U Stn ] and V n = [Vi,„ • ■ ■ V s ,n] where both U^ n and V itn have ji 
columns, and recalling (|10[) . we have 



zm(z 2 )uj 2 —1 
— 1 zm{z 2 ) 



2 m(z 2 )rh(z 2 )uj 2 — 1 



«C7 (Q(z 2 )- m(z 2 )l) b[ 
u t VZQ(z 2 )X*b\ 



dz 



Op^y f z^ 2 m {z 2 )m{z 2 )b\^ g {Q(z 2 ) - m(z 2 )l) b[ ^ 
iTTy/n ^— J J li z 2 m(z 2 )m(z 2 )u! 2 — 1 

C(p0 /" ^zm(z 2 )6t^y/Q(z 2 )X*fe' 1 



z 2 m(z 2 )fn(z 2 )uj 2 — 1 



dz 



uiw 2 m(w;)ro(u>)&*iLj (Q(w) — m{w)I) b[ 
wm(w)rh(w)u) 2 — 1 



wm(w)rh(w)uj 2 — 1 



where 7,- encloses only. These integrals are zero for l^i. For large n and with probability 
one, none of the numerators has a pole within 7^, hence by the Residue Theorem 



a.s. for n large enough. 

Due to the bounded character of and to Corollary [31 Xxi is tight for every i. By 

Lemma [8j 



X\ a >; <5j_i 



6j (Q(pi) - m(pi)/) 6j ^E/^QQ^LY*^ 
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The terms X<i^ 
We have here 

Y CO*) 



^ 4 (Q(z 2 ) - 772(Z 2 )/) E£ ^XQ(z 2 )V^ 



zfh(z 2 )uj 2 — 1 
— 1 zm(z 2 ) 



C(Pi) 



z 2 m(z 2 )m(z 2 )uj 2 — 1 

wm(w)rh(w)ujjb\ (Q(w) — m(w)I) Hzb[ 
wm{w)rh(w)u> 2 — 1 

w*m(w)&*XQ(iu)^f7;&i 



zm{z 2 )U^b[ 




dz 



<7t<> 



2my/n ^-^ wm(w)rh(w)w 2 — 1 



&-1.0 



y/nm(pi) y/ri 

'b\ {Q{ Pi ) - m{ Pi )i) n^; u^XQO^W&i 



w.p. 1 for large n 



by Corollary |3] and Lemma [S] 



T7ie terms X^^ 

From ((TO]) and ([20]). we have 

C(Pi)_ 

Z77l(z 2 )w 2 



X; 



3,2 



^ [^m(z 2 )6f?7 0] 
J n _ »_i 

(8)2y ) E(i 



1 



-1 

zm(z 2 ) 



zfh(z 2 )w 2 — 1 
— 1 zm(z 2 ) 



(z 2 m(z 2 )rh(z 2 )uj 2 — l)(z 2 m(z 2 )m(z 2 )w 2 — 1) 

zm(z 2 )U*b[ 



dz 



CiPi) 



J) [w 2 z 2 m(z 2 )m(2; 2 ) -zm(z 2 )] 

v n hi „ , =1 



w £ 6in p xQ(z 2 )^t/;&' 1 



^6tC/ p ^,*Q(z 2 )X*n £ 6i zuy^£/ p F p *(Q(z 2 ) - m(z 2 ))W;&i 



(z 2 m(z 2 )m(z 2 )a; 2 — l)(z 2 m(z 2 )rh(z 2 )ui 2 — 1) 



E 



G pA w ) 



C(Pi) 

2m h ^J J - 1 {wm{w)fh(w)u) 2 — l)(wm(t/j)rn,(u;)u; 2 — 1) 



w 2 z 2 m{z 2 )rh{z 2 ) 
—zm{z 2 ) 

dw 



dz 
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where 



G p i(w) = n 1 I 2 (^ 2 uj 2 w 2 m(w) 2 m(w) 2 b\Xl p (Q{w) — m(w)I)Hgb\ 
- tu p Lujwm(w) 2 m(w) b* 1 U p V p *Q(w)X*Il e b' 1 



— uj p u)ewm(w) m(w)blIl p XQ(w)VeU(b' 1 

+uj p io e wm(wj 2 blU p V*(Q(w) - m(w)I)V e U;b' 1 

For large n and with probability one, the G p i(w) are holomorphic functions in a domain 
enclosing 7^, and G p e(w) does not cancel any of the terms of the denominator. The integrals 
of all terms in the sum such that p 7^ i and £ 7^ i arc zero. Each of the integrands of the terms 
p = i, £ 7^ i or p i, £ = i has a pole with degree one, and the corresponding integrals are of 
the form KuGu(pi) or K p iG P i(pi) where the Ku and K p i are real constants. By inspecting 
the expression of G p i and by using Corollary [3] and Lemma [5J it can be seen that these terms 
converge to zero in probability. It remains to study the term p = £ = i, which has a degree 2 
pole. Recalling that the residue of a meromorphic function f(z) that has a pole with degree 
2 at zq is lim z _> Zo d ((z — zo) 2 f(z)) /dz and letting gg(z) = zm(z)m(z)ui 2 — 1, the integral of 
this term is 



C(Pi) 



9i(Pi 



Thanks to Corollary and Lemma H R(G H (pi)) A 0. The same can be said about G' u (pi) 
after a simple modification of Proposition [2] and Corollary [3] In conclusion, 



Vi = 1, 



77ie terms qi 

These are the higher order terms that appear when we expand the right hand side of 
(fTSj) . We shall work here on one of these terms, namely 



C(Pi) 



a*(z, pi) (H(z)- 1 - H(z)- 1 + H{z)^ 1 E{z)H{z)~ 1 ^ a' v (z, Vl ) 



dz 



and show that e — > 0. The other higher order terms can be handled similarly. Writing 
2 = \/~Pi + i?exp(2z7r#) on the circle 7$, we have 



\e\<kjh / pre*)- 1 -#(«)- 

Jo 



H( Z y 1 E(z)H(z) 



d6 



where K is a constant whose value can change from line to line, but which remains inde- 
pendent from n. Let <p be a function from [0, 1] to a normed vector space. If (j) is twice 
differentiablc on (0, 1), then it is known that ||0(1) - (f)(0) - <p'(0)\\ < sup te(oa) O.5||0"(t)|| . 

Setting 4>(t) = (H + tE)' 1 and recalling that H = H + E, we have </>(!) = H, 0(0) = H 
and <f>"(t) = (H + tE)- l E{H + tE)~ 1 E(H + tE)^ 1 , hence 

\\Hiz)- 1 - H(z)- 1 + H(z)- 1 E(z)H(z)- 1 \\ < A'||£(z)|| 2 

for 2 G 7i- Write Q — ml = (Q — al) + (a — m)I and Q — ml = (Q — al) + (a — m)I, and 
decompose E as defined in (|2U)) as E = E\ + E2 where 



E 1 (z) 
E 2 (z) 



zU*{Q{z 2 l-a{z 2 )I N )U U*XQ(z 2 )Vn 

nV*Q(z 2 )X*U zVLV*{Q{z 2 ) - &(z 2 )I n )Vn 

zU*{a{z 2 )-m{z 2 ))I N )U 

zVtV*{a(z 2 ) -m(z 2 ))I n Vn 
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Consider any element of E\, for instance zu*(Q(z 2 ) — a(z 2 )I)ui. By Lemma|31 



/nE 



l 0n |u*(<9- a)ui\ 2 d9 



? A' 

mi El u Q-dk dfl<-F 
/o V™ 



which shows that y/n \\Ei\\ 2 d9 0. 



We now prove that J Q II^H 2 ^ A 0. In the space of probability measures on R 
endowed with the weak convergence metric, in order to prove that a sequence converges 
weakly to p, it is enough to prove that from any sequence, we can extract a subsequence 
along which the weak convergence to p holds true. We shall show along this principle 

that yfn Jq \\E2\\ 2 d9 A 0. Consider the term \fn(a — to). Let (z k ) be a denumerable 
sequence of points in C — [0, A+] with an accumulation point in that set. By IA8| from 
every sequence, there is subsequence ni such that y/ne,(a ne (z\) — m(z\)) — > almost surely 
(recall that the convergence in probability implies the a.s. convergence along a subsequence). 
By Cantor's diagonal argument, we can extract a subsequence (call it again ni) such that 
yjn~f (a ni (z k ) — m(z k )) — > almost surely for every fc. By the normal family theorem, there 
is a subsequence along which the function v / nf(a„ f — m) — > uniformly on 7, a.s. Repeating 

the argument for y/n(a — in), there is a subsequence ni along which \Jn~i Jq \\E2\\ 2 d8 
0, hence weakly. Necessarily, ^/ng \\E2\\ 2 d9 converges weakly to zero. Now since the 
weak convergence to a constant is equivalent to the convergence in probability to the same 
constant, we obtain the desired result. We have finally shown that: 



Vi = 1, ...,s, 



v 



Final derivations 

Write x = [x'(<Pi)j ■ • ■ j x'i'Pr)] ■ Generalizing the previous argument to all the ip k and 
gathering the results, we obtain 



n-^ 2 X ' x 23? 



K {Q(Pi(k)) - m{p m )l) b' k b* k (Q(p i{k )) - m(p m )l) U l(k) b' k 
Vnm{pn yk - ) ) y/nm(p i{k) ) 



J k=l 



cD_ 

7s 



m(p i(k) ) 



k=l 



By Lemma |H1 matrix A 



satisfies A* A — > I r . Recall from the same lemma 



k=l 



that B*B -> I r , (B^YB 1 - -> I r and (B ± )*B -> 0. Hence, Proposition [2] can be applied to 
the r.h.s. of this expression, and n^ x l 2 x' converges in law to 

( c 2 D 2 (m'{p i(k) ) - m{p i(k) ) 2 „ 

1 ' ~6~ g (, cm{p m ) 2 H^«) + Pl « m WW) 



fc=i, 



It remains to recall Lemmas |H] and [TO] to terminate the proof of Theorem 2J 



Appendix A. Proof of Proposition [2] 



The tightness of follows from Lemmas [3] and [4] with p = 2 and from the application of 
Chcbyshcv's inequality. 
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Let Z = [-^»,fe]j fct=i an d Z = [zi,k\7'k=i be -/V x i and nxt standard Gaussian random matrices 
chosen such that Z, Z and the N x N matrix T of singular values of X are independent. For 
k = 1, . . • ,t/2, let D fe = diag(d ?; , fe )ti = (r 2 - pu)' 1 and C fe = diagfc, h )*L t = T(T 2 - p k )-\ 
Then 



N 



{Z*Z)- 1 ' 2 Z* [D 



i l^)z{z*z)- 1 ' 2 



fe,fe+t/2/ fc=l,...,i/2 

>/n( (Z*Z)- 1 / 2 Z*C fe Z[l;^](Z*Z)- 1 / 2 



k=l,...,t/2 



where Z[l; N] is Z truncated to its first TV rows. By the Law of Large Numbers, N 1 Z* Z — >• 
It and n _1 Z*Z — > / t almost surely. Hence, if we show that the multidimensional random 
variables A fe> „ = N~ 1 / 2 Z*(D k - N^ttD^Z and B fc> „ = N-^ 2 Z*C k Z[l; N] are tight for 
k = 1, . . . , £/2, and 



Z*C fc Z[l;JV] 



fc=l,...,t/2 



converges in law towards £/V(0,i?), the second result of Proposition [2] is proven. From A3 
and A4, 



1 N 



k-'^)' = ±trQ( n f- (±tvQ( n )\ -^m'(p t )-m(p k f, and 

1=1 ^ ' 

J^J2 c lk = -fj^Q(Pk) + -^tTQ(p k ) 2 -^-> m(pk) + Pkm'(pk) 

i=l 

for all fc = l,...,t/2. Recalling that Z and Z are standard Gaussian, it results that 
limsup n E [||A fci „|| 2 || r n l and limsup n E [||-Bfc.„|| 2 || r„] are bounded w.p. 1 by a constant. 
Tightness of the A kn and B kn follows. Now we have 



v i—i 

i N 



N~ L trD 



Observe that covariance matrix of r\ n conditional to T n converges almost surely to R. More- 
over, thanks to A4, it is easy to see that the Lyapunov condition 



i n 

^E E [ii u -n 2(1+a) ii^ 
»=i 



-> 



is satisfied for any a > 0, hence ?7 n — » CAF(0, R) which completes the proof of Proposition^ 
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Appendix B. Sketch of the proof of Proposition [3j 

For k = 1, . . . , r, let pk, n be the solutions of the equation u 2 n g(p) = 1, where we recall 
that the uj 2 , are the diagonal elements of matrix f2„ . Then, by a simple extension to the 
case r > 1 of the proof of 0, Th. 2.15] , one can show that the sequences y/n(\k, n — Pk.n) are 
tight. To obtain the result, we show that \/n{pk.n — Pi(k)) = 0(1)- Since g is decreasing, this 
amounts to showing that v / n(oj^. n — w?/ fc ^) = 0(1). Since the non zero eigenvalues of PP* 
coincide with those of B*B S*S, it will be enough to prove that y/n(B*B S*S - O) = 0(1). 
It is clear that B*B = I r + rT 1 A where sup„ ||A|| < oo, hence y/n{B* BO - O) -> 0. By the 
last item in Assumption A6, \JnB* B(S* S — O) = 0(1), and the proposition is shown. 



Appendix C. Proof of Lemma [8l 

Observing that 

bffr) = =^ [texpi-iDfy)]^ 1 and = [£ 2 exp(-*D^)] ^ 



and using the fact that N~( K+1) Y^Lo* ^ exp(ia£) -> 5 afi /{K + 1) for a G [-7r,7r], we 
have 5*5 -> I r , n- 1 B*B' -> -{icD/2)I r , n~ 2 (B')*B' -4 (c 2 D 2 /3)I r , and n - 2 B*B" -> 
~(c 2 D 2 /3)I r . 

Writing B 1 - = 2^/~3(ncD)~ 1 B' + i^/3B and replacing in the above convergences, the stated 
properties of B become straightforward. 

We now show the last convergence. Assume without generality loss that i = 1 and recall that 
S*S -> O 2 . Consider the isometry matrices W = B{B*B)- 1 ' 2 and Z = S{S*S)- 1 / 2 , and 
let A = (B*B) 1 / 2 (S*S) 1/2 , resulting in P = WAZ*. Notice that the singular values of A 
coincide with those of P apart from the zeros. Let 7Ti be the orthogonal projection matrix on 
the eigenspace of AA* associated with the eigenvalues ojf n ,..., ui 2 n . With these notations, 



IT = W-niW* and U Bl = B 1 (BIB 1 )- 1 BI. We have 4->0, hence 7Ti 



I n 




Since 



B*B —> I, for any vector x such that ||x|| = 1, we have x*Uix — x*BiB^x —J- 0, and 
x*U Bl x - x*BiB{x 0. Therefore, x*(ILi - U Bl )x -> 0, which proves the last result. 
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